Fusing acoustic, phonetic and data-driven systems for text-independent speaker verification
Authors
Abstract
This paper describes our recent efforts in exploring data-driven high-level features and their combination with low-level spectral features for speaker verification. In particular, we compare the phonetic and data-driven approaches and study their complementarity with the short-term acoustic approach. Our objective is to show that data-driven units, automatically acquired from the speech data, can be used like phonemes to extract high-level features and to bring complementary speaker-specific information that provides improvements when fused with acoustic systems. Results obtained on the NIST 2006 Speaker Recognition Evaluation data show that the combination of the phonetic, data-driven and Gaussian Mixture Model (GMM) systems brings a 27% relative reduction of the equal error rate (EER) in comparison to the baseline GMM system.
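As a rough illustration of the kind of evaluation the abstract refers to, the Python sketch below computes an approximate EER from verification scores, fuses two systems at the score level, and reports the relative EER reduction. Everything in it is an assumption made for illustration: the scores are synthetic, the equal-weight linear fusion is a stand-in (the abstract does not state which fusion method was used), and the function names are hypothetical.

```python
import random


def equal_error_rate(target_scores, impostor_scores):
    """Approximate EER: sweep every observed score as a threshold and return
    the point where false-acceptance and false-rejection rates are closest."""
    best_gap, eer = None, None
    for t in sorted(set(target_scores) | set(impostor_scores)):
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def fuse(score_lists, weights):
    """Weighted sum of per-trial scores from several systems (assumed fusion rule)."""
    return [sum(w * s for w, s in zip(weights, trial)) for trial in zip(*score_lists)]


def relative_reduction(baseline_eer, fused_eer):
    """Relative EER reduction with respect to the baseline system."""
    return (baseline_eer - fused_eer) / baseline_eer


# Synthetic scores for two toy systems with independent noise (not real data).
rng = random.Random(0)


def toy_scores(n, mean):
    return [rng.gauss(mean, 1.0) for _ in range(n)]


tgt_a, imp_a = toy_scores(200, 1.5), toy_scores(200, 0.0)  # stand-in "acoustic" system
tgt_b, imp_b = toy_scores(200, 1.5), toy_scores(200, 0.0)  # stand-in "high-level" system

baseline = equal_error_rate(tgt_a, imp_a)
fused = equal_error_rate(fuse([tgt_a, tgt_b], [0.5, 0.5]),
                         fuse([imp_a, imp_b], [0.5, 0.5]))
print(f"baseline EER: {baseline:.3f}")
print(f"fused EER:    {fused:.3f}")
print(f"relative reduction: {relative_reduction(baseline, fused):.1%}")
```

The threshold sweep is quadratic in the number of trials, which is fine for a toy example; a real evaluation would sort the scores once and use calibrated fusion weights trained on held-out data.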
Similar resources
Generalized I-vector Representation with Phonetic Tokenizations and Tandem Features for both Text Independent and Text Dependent Speaker Verification
This paper presents a generalized i-vector representation framework with phonetic tokenization and tandem features for text independent as well as text dependent speaker verification. In the conventional i-vector framework, the tokens for calculating the zero-order and first-order Baum-Welch statistics are Gaussian Mixture Model (GMM) components trained from acoustic level MFCC features. Yet bes...
Phonetic, idiolectal and acoustic speaker recognition
This paper describes a text-independent speaker recognition system that achieves an equal error rate of less than 1% by combining phonetic, idiolect, and acoustic features. The phonetic system is a novel language-independent speaker-recognition system based on differences among speakers in dynamic realization of phonetic features (i.e., pronunciation), rather than spectral differences in voice q...
Phonetic Speaker Id
This paper describes the exploration of text-independent speaker identification using novel approaches based on speakers’ phonetic features instead of traditional acoustic features. Different phonetic speaker identification approaches are discussed in this paper and evaluated using two speaker identification systems: one multilingual system and one single language multiple-engine system. Furthe...
DNN i-Vector Speaker Verification with Short, Text-Constrained Test Utterances
We investigate how to improve the performance of DNN i-vector based speaker verification for short, text-constrained test utterances, e.g. connected digit strings. Text-constrained verification, due to its smaller, limited vocabulary, can deliver better performance than text-independent verification for a short utterance. We study the problem with “phonetically aware” Deep Neural Net (DNN) in its cap...
Unsupervised learning of HMM topology for text-dependent speaker verification
Usually, text-dependent speaker verification can achieve better performance than a text-independent system because of the constraint that the enrollment and test utterances share the same phonetic content. However, the enrollment data for a text-dependent system is usually very limited. Expectation Maximization (EM) training of an HMM will suffer from noisy estimation because of the limited enrollment data. A...
Journal title:
Volume, issue:
Pages: -
Publication date: 2007